Statistica Sinica 12 (2002), 7-29

GENERALIZED ASSOCIATION PLOTS: INFORMATION VISUALIZATION VIA ITERATIVELY GENERATED CORRELATION MATRICES

Author

  • Chun-Houh Chen
Abstract

Given a p-dimensional proximity matrix $D_{p \times p}$, a sequence of correlation matrices, $R = (R^{(1)}, R^{(2)}, \ldots)$, is iteratively formed from it. Here $R^{(1)}$ is the correlation matrix of the original proximity matrix $D$, and $R^{(n)}$ is the correlation matrix of $R^{(n-1)}$, $n > 1$. This sequence was first introduced by McQuitty (1968); Breiger, Boorman and Arabie (1975) developed an algorithm, CONCOR, based on their rediscovery of its convergence. The sequence $R$ often converges to a matrix $R^{(\infty)}$ whose elements are $+1$ or $-1$. This special pattern of $R^{(\infty)}$ partitions the p objects into two disjoint groups and so can be recursively applied to generate a divisive hierarchical clustering tree. While convergence is itself useful, we are more concerned with what happens before convergence. Prior to convergence, we note a rank reduction property with elliptical structure: when the rank of $R^{(n)}$ reaches two, the column vectors of $R^{(n)}$ fall on an ellipse in a two-dimensional subspace. The unique order of relative positions for the p points on the ellipse can be used to solve seriation problems such as the reordering of a Robinson matrix. A software package, Generalized Association Plots (GAP), has been developed that uses computer graphics to retrieve important information hidden in the data or proximity matrices.
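The iteration described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the GAP software or the original CONCOR code: the function names `iterate_correlations` and `split_by_sign`, the stopping tolerance, and the toy distance matrix are all assumptions introduced here. It forms the correlation matrix of the columns of the proximity matrix, repeats the step on each result, and reads a two-group split off the sign pattern of the final matrix.

```python
import numpy as np


def iterate_correlations(D, max_iter=100, tol=1e-8):
    """Return [R1, R2, ...]: R1 is the column-correlation matrix of D,
    and each later matrix is the column-correlation matrix of the previous one.

    Hypothetical helper; max_iter and tol are illustrative choices.
    """
    R = np.corrcoef(np.asarray(D, dtype=float), rowvar=False)
    seq = [R]
    for _ in range(max_iter - 1):
        R_next = np.corrcoef(R, rowvar=False)
        seq.append(R_next)
        # Stop once successive matrices are numerically identical.
        if np.max(np.abs(R_next - R)) < tol:
            break
        R = R_next
    return seq


def split_by_sign(R_limit):
    """Two-group split from a matrix whose entries are (close to) +1 or -1.

    Objects positively correlated with the first object form one group.
    """
    return R_limit[:, 0] > 0


# Toy proximity matrix (an assumption for illustration): pairwise distances
# between six points on a line that form two well-separated clusters.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
D = np.abs(x[:, None] - x[None, :])

seq = iterate_correlations(D)
groups = split_by_sign(seq[-1])
print(len(seq), groups)
```

Applying the same split recursively within each group would give a divisive tree of the kind the abstract mentions. The intermediate matrices in `seq` are the ones of interest for the rank-two elliptical structure, which this sketch does not attempt to detect.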


Similar resources

Selecting the Number of Change-points in Segmented Line Regression.

Segmented line regression has been used in many applications, and the problem of estimating the number of change-points in segmented line regression has been discussed in Kim et al. (2000). This paper studies asymptotic properties of the number of change-points selected by the permutation procedure of Kim et al. (2000). This procedure is based on a sequential application of likelihood ratio typ...


Generalized Double Pareto Shrinkage.

We propose a generalized double Pareto prior for Bayesian shrinkage estimation and inferences in linear models. The prior can be obtained via a scale mixture of Laplace or normal distributions, forming a bridge between the Laplace and Normal-Jeffreys' priors. While it has a spike at zero like the Laplace density, it also has a Student's t-like tail behavior. Bayesian computation is straightforw...


Power and Sample Size Calculations for Generalized Estimating Equations via Local Asymptotics.

We consider the problem of calculating power and sample size for tests based on generalized estimating equations (GEE), that arise in studies involving clustered or correlated data (e.g., longitudinal studies and sibling studies). Previous approaches approximate the power of such tests using the asymptotic behavior of the test statistics under fixed alternatives. We develop a more accurate appr...


Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.

We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival...


New robust dynamic plots for regression mixture detection

The forward search is a powerful general method for detecting multiple masked outliers and for determining their effect on inferences about models fitted to data. From the monitoring of a series of statistics based on subsets of data of increasing size we obtain multiple views of any hidden structure. One of the problems of the forward search has always been the lack of an automatic link among ...



Journal: Statistica Sinica

Volume: 12

Pages: 7-29

Publication date: 2002